Inferring Decision Trees Using the Minimum Description Length Principle
Authors
Abstract
This paper concerns methods for inferring decision trees from examples for classification problems. The reader who is unfamiliar with this problem may wish to consult J. R. Quinlan’s paper (1986) or the excellent monograph by Breiman et al. (1984), although this paper will be self-contained. This work is inspired by Rissanen’s work on the minimum description length principle (MDLP for short) and on his related notion of the stochastic complexity of a string (Rissanen, 1986b). The reader may also want to refer to related work by Boulton and Wallace (1968, 1973a, 1973b), Georgeff and Wallace (1984), and Hart (1987). Roughly speaking, the minimum description length principle states that the best “theory” to infer from a set of data is the one which minimizes the sum of
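As an illustration only, the sketch below scores candidate trees by a two-part description length: bits to describe the tree plus bits to describe the examples the tree misclassifies. This two-part reading is a common formulation of MDL and is assumed here, since the abstract above is cut off. The coding scheme (1 bit per node type, 1 bit per leaf label, `log2(n + 1) + log2(C(n, k))` bits for `k` exceptions among `n` examples) and all function names are hypothetical simplifications, not the paper's exact encoding. Attributes and labels are assumed binary.

```python
from math import comb, log2

def exception_bits(n, k):
    """Bits to say how many (k) and which of n examples are exceptions (assumed code)."""
    return log2(n + 1) + log2(comb(n, k))

def leaf_bits(labels):
    """Cost of a leaf: 1 bit node type + 1 bit default class + its exceptions."""
    n = len(labels)
    ones = sum(labels)
    k = min(ones, n - ones)  # exceptions to the majority label
    return 2 + exception_bits(n, k)

def stump_bits(rows, labels, attr, n_attrs):
    """Cost of a one-split tree on binary attribute `attr`."""
    left = [y for x, y in zip(rows, labels) if x[attr] == 0]
    right = [y for x, y in zip(rows, labels) if x[attr] == 1]
    # 1 bit node type + bits to name the attribute + the two leaves
    return 1 + log2(n_attrs) + leaf_bits(left) + leaf_bits(right)

def best_description(rows, labels):
    """Return the name of the cheapest candidate and all candidate costs in bits."""
    n_attrs = len(rows[0])
    costs = {"leaf": leaf_bits(labels)}
    for a in range(n_attrs):
        costs[f"split on {a}"] = stump_bits(rows, labels, a, n_attrs)
    return min(costs, key=costs.get), costs

# Toy data: the label equals attribute 0, attribute 1 is noise.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 4
labels = [x[0] for x in rows]
choice, costs = best_description(rows, labels)
print(choice, {name: round(bits, 2) for name, bits in costs.items()})
```

Under this assumed coding, a split is chosen only when the bits it saves on exceptions exceed the bits needed to describe the extra nodes, which is the accuracy-versus-complexity tradeoff that MDL-based tree methods rely on.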
Similar resources
Inferring Reduced Ordered Decision Graphs of Minimum Description Length
We propose a heuristic algorithm that induces decision graphs from training sets using Rissanen's minimum description length principle to control the tradeoff between accuracy in the training set and complexity of the hypothesis description.
Context Maximizing: Finding MDL Decision Trees
We present an application of the context weighting algorithm. Our objective is to classify objects with decision trees. The best tree will be searched for with the Minimum Description Length Principle. In order to find these trees, we modified the context weighting algorithm.
Causal Inference on Multivariate and Mixed-Type Data
Given data over the joint distribution of two random variables X and Y, we consider the problem of inferring the most likely causal direction between X and Y. In particular, we consider the general case where both X and Y may be univariate or multivariate, and of the same or mixed data types. We take an information theoretic approach, based on Kolmogorov complexity, from which it follows that...
Causal Inference on Multivariate Mixed Type Data
Given data over the joint distribution of two univariate or multivariate random variables X and Y of mixed or single type data, we consider the problem of inferring the most likely causal direction between X and Y. We take an information theoretic approach, from which it follows that first describing the data over cause and then that of effect given cause is shorter than the reverse direction. F...
Attribute Value Selection Considering the Minimum Description Length Approach and Feature Granularity
In this paper we introduce a new approach to automatic attribute and granularity selection for building optimum regression trees. The method is based on the minimum description length principle (MDL) and aspects of granular computing. The approach is verified by giving an example using a data set which is extracted and preprocessed from an operational information system of the Components Toolsh...
Journal title:
- Inf. Comput.
Volume 80, issue
Pages -
Publication date 1989